female voice
SageLM: A Multi-aspect and Explainable Large Language Model for Speech Judgement
Ge, Yuan, Zhang, Junxiang, Liu, Xiaoqian, Li, Bei, Ma, Xiangnan, Wang, Chenglong, Ye, Kaiyang, Du, Yangfan, Zhang, Linfeng, Huang, Yuxin, Xiao, Tong, Yu, Zhengtao, Zhu, JingBo
Speech-to-Speech (S2S) Large Language Models (LLMs) are foundational to natural human-computer interaction, enabling end-to-end spoken dialogue systems. However, evaluating these models remains a fundamental challenge. We propose SageLM, an end-to-end, multi-aspect, and explainable speech LLM for comprehensive evaluation of S2S LLMs. First, unlike cascaded approaches that disregard acoustic features, SageLM jointly assesses both semantic and acoustic dimensions. Second, it leverages rationale-based supervision to enhance explainability and guide model learning, achieving superior alignment with evaluation outcomes compared to rule-based reinforcement learning methods. Third, we introduce SpeechFeedback, a synthetic preference dataset, and employ a two-stage training paradigm to mitigate the scarcity of speech preference data. Trained on both semantic and acoustic dimensions, SageLM achieves an 82.79% agreement rate with human evaluators, outperforming cascaded and SLM-based baselines by at least 7.42% and 26.20%, respectively.
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (5 more...)
- Health & Medicine > Consumer Health (1.00)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.93)
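The SageLM abstract above reports an agreement rate with human evaluators. The following is a minimal sketch of how such a rate could be computed, assuming it is simply the percentage of items on which the judge's label matches the human label; the abstract does not spell out the exact protocol, and the function name and labels are hypothetical.

```python
from typing import Sequence


def agreement_rate(judge: Sequence[str], human: Sequence[str]) -> float:
    """Percentage of items on which the judge label equals the human label.

    Assumption: plain label matching; not confirmed by the abstract.
    """
    if len(judge) != len(human):
        raise ValueError("label lists must have the same length")
    if not judge:
        raise ValueError("at least one labelled item is required")
    matches = sum(j == h for j, h in zip(judge, human))
    return 100.0 * matches / len(judge)


# Hypothetical preference labels for four response pairs ("A", "B", or "tie").
judge_labels = ["A", "B", "A", "tie"]
human_labels = ["A", "B", "B", "tie"]
print(f"{agreement_rate(judge_labels, human_labels):.2f}%")  # prints 75.00%
```

Three of the four hypothetical labels match, so the sketch reports 75.00%.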
Conversations with Andrea: Visitors' Opinions on Android Robots in a Museum
Heisler, Marcel, Becker-Asano, Christian
The android robot Andrea was set up at a public museum in Germany for six consecutive days to hold fully autonomous conversations with visitors. No specific context was given, so visitors could state their opinions on possible use cases in structured interviews without bias. Additionally, the 44 interviewees were asked for their general opinions of the robot, their reasons (not) to interact with it, and necessary improvements for future use. The android's voice and wig were changed between days of operation to give varying cues regarding its gender. This did not have a significant impact on the positive overall perception of the robot. Most visitors want the robot to provide information about exhibits in the future, while other roles, such as receptionist, were wanted by some visitors and explicitly not wanted by others. Speaking more languages (than only English) and faster response times were the improvements most desired. These findings from the interviews are in line with an analysis of the system logs, which revealed that, after chitchat and personal questions, most of the 4436 collected requests asked for information related to the museum and for conversation in a different language. The valuable insights gained from these real-world interactions are now used to improve the system so that it becomes a useful real-world application. An android robot's outer appearance is explicitly designed to resemble a human as closely as possible.
- Asia > Japan (0.05)
- Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)
- Oceania > Australia > New South Wales > Sydney (0.04)
- (5 more...)
- Research Report > New Finding (1.00)
- Personal > Interview (1.00)
No Pitch Left Behind: Addressing Gender Unbalance in Automatic Speech Recognition through Pitch Manipulation
Fucci, Dennis, Gaido, Marco, Negri, Matteo, Cettolo, Mauro, Bentivogli, Luisa
Automatic speech recognition (ASR) systems are known to be sensitive to the sociolinguistic variability of speech data, in which gender plays a crucial role. This can result in disparities in recognition accuracy between male and female speakers, primarily due to the under-representation of the latter group in the training data. While in the context of hybrid ASR models several solutions have been proposed, the gender bias issue has not been explicitly addressed in end-to-end neural architectures. To fill this gap, we propose a data augmentation technique that manipulates the fundamental frequency (f0) and formants. This technique reduces the data unbalance among genders by simulating voices of the under-represented female speakers and increases the variability within each gender group. Experiments on spontaneous English speech show that our technique yields a relative WER improvement up to 9.87% for utterances by female speakers, with larger gains for the least-represented f0 ranges.
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Europe > Northern Europe (0.04)
- Europe > Italy > Trentino-Alto Adige/Südtirol > Trentino Province > Trento (0.04)
- (2 more...)
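The pitch-manipulation abstract above reports a relative WER improvement. As a worked illustration of that metric, the sketch below computes the relative reduction in word error rate between a baseline and an augmented system. The function name and the WER values are hypothetical and are not taken from the paper.

```python
def relative_wer_improvement(baseline_wer: float, augmented_wer: float) -> float:
    """Relative WER reduction in percent: 100 * (baseline - augmented) / baseline."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return 100.0 * (baseline_wer - augmented_wer) / baseline_wer


# Hypothetical values: a baseline WER of 20.0% that drops to 18.0%
# after augmentation is a 10% relative improvement (2.0 points absolute).
print(relative_wer_improvement(20.0, 18.0))  # prints 10.0
```

A relative figure such as the reported 9.87% therefore depends on the baseline: the same absolute drop counts for more when the baseline WER is lower.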
Amazon's Alexa is accused of sexism after being unable to give the result of the Lionesses' World Cup semi-final because it didn't know the match had taken place
Amazon's virtual assistant Alexa has been accused of sexism after being unable to respond to a question about the Lionesses' World Cup semi-final. British academic Dr Joanne Rodda asked Alexa for the result of Wednesday's match against Australia, which England won 3-1. But the supposedly 'smart' technology didn't even know the match had taken place as it was only familiar with the men's game, the BBC reports. Astonishingly, when Dr Rodda asked 'for the result of the England-Australia football match', Alexa said there was no such match. Amazon admitted the mistake was due to an 'error' – although it didn't specify the cause – and that Alexa will get better at learning over time.
Amazon is making a HUGE change to Alexa's voice - here's what it means for your smart assistant
Amazon has revealed a huge change that will make interacting with its smart speakers a lot less fun. The tech giant is retiring all three celebrity voices for its smart speakers – Samuel L. Jackson, Shaquille O'Neal and Melissa McCarthy. Amazon offered the superstar voices for $4.99 each as an alternative to Alexa, but these are no longer available for purchase on its website. Amazon, which released its fifth generation Echo Dot smart speaker last year, said customers can contact them for a refund. The feature was for US users only, although the tech giant does offer alternative voices for its smart assistant in the UK, such as Santa Claus.
- Information Technology (0.61)
- Leisure & Entertainment (0.53)
- Media > Film (0.37)
Why Fake Drake and AI-Generated Music Are Here to Stay
Lauren: So you are the editor in chief here at WIRED, and you've been talking a lot about AI, so I wanted to see how good you are at telling regular human-made music apart from AI-generated music. Gideon: I mean, I can barely tell music by one human apart from another sometimes. So, uh, you know, you might be disappointed, but I will do my best. Lauren: Let's hear the second one. Lauren: First, I'm curious if you know who the artist is. Gideon: I have no idea.
- Media > Music (0.96)
- Leisure & Entertainment (0.96)
- Information Technology > Communications > Mobile (0.40)
- Information Technology > Artificial Intelligence (0.38)
What I've Learned After 26 Rides In A Driverless Cruise Robotaxi
About three weeks ago, I received an email with an access code for an app that allowed me to take rides in a robotaxi. The app comes from Cruise, a startup that was acquired by General Motors in 2016 for just under a billion dollars. With it, I could now use the Cruise robotaxis, which have been operating in San Francisco since August 2021, for myself. What makes them special is that these robotaxis are driverless. There is no one in the car when it picks up passengers. Thanks to a friend who had been given access to the app a few months earlier, I was able to make my first two trips as early as the beginning of July. I reported on that, especially because two coyotes had crossed our path.
- North America > United States > California > San Francisco County > San Francisco (0.27)
- Pacific Ocean > North Pacific Ocean > San Francisco Bay (0.04)
- Europe (0.04)
- Transportation > Passenger (1.00)
- Transportation > Ground > Road (1.00)
- Information Technology > Robotics & Automation (1.00)
- Automobiles & Trucks (1.00)
In South Korea, robots are on the job. So how is the service?
I met my first South Korean robots as I checked into the Henn na Hotel in Seoul at the end of a 21-hour journey from the U.S.: two plane flights and a bleary-eyed ride on the transit rail. Behind the front desk stood two gleaming white androids, with big round heads framing green digital eyes and thin green smiles. I headed for the androids. The robot clerk on the right came alive to greet me -- first in English, then in Korean, Japanese, and Chinese, in quick succession. "Welcome to the Henn na Hotel!" it said in a chirpy female voice. It was eerily humanoid yet inhuman, with hands that looked like white-fingered gloves and thin black mechanical joints for elbows. Its cartoony face was drawn for friendliness. Its slender arms occasionally swept outward in a welcoming gesture.
- Asia > South Korea > Seoul > Seoul (0.26)
- Asia > North Korea (0.14)
- Asia > Japan > Kyūshū & Okinawa > Kyūshū > Nagasaki Prefecture > Nagasaki (0.05)
- (6 more...)
- Health & Medicine (0.94)
- Consumer Products & Services > Restaurants (0.94)
12 Amazing Facts About AI - Simple Programmer
Artificial Intelligence (AI) is a branch of computer science that helps build smart machines. AI provides data that makes these machines capable enough to match human intelligence. As a result, many industries have taken advantage of AI technologies. Machine Learning and Deep Learning are the two subsets of Artificial Intelligence. Whereas machine learning refers to computers able to think and act with less human intervention, deep learning involves computers able to use structures modeled on the human brain.
- Oceania > New Zealand (0.05)
- Oceania > Australia (0.05)
- North America > Canada (0.05)
- (3 more...)
- Information Technology (0.49)
- Banking & Finance > Trading (0.31)
Apple launches iOS 15.4 update with pregnant man emoji and gender-neutral voice for Siri
Apple has finally rolled out its much-anticipated iOS 15.4 update, allowing iPhone users to unlock their smartphone while wearing a mask. The update also includes 37 new emoji, including a pregnant man, a motorcycle tyre, a slide, a disco ball, a troll with a club, coral, kidney beans and a low battery. There's also a new 'gender neutral' voice for its smart assistant Siri, called Quinn, recorded by a member of the LGBTQ community. 'iOS 15.4 offers the ability to use Face ID while wearing a mask, a new Siri voice option, expanded language support for Visual Lookup, new emoji, and much more,' Apple said. Here's a look at the key features in iOS 15.4, which is available now on the iPhone 6s or later.
- Information Technology (0.49)
- Health & Medicine > Therapeutic Area > Immunology (0.30)
- Information Technology > Communications > Mobile (1.00)
- Information Technology > Communications > Social Media (0.95)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.89)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.89)